19 research outputs found
Approximation Results for Gradient Descent trained Neural Networks
The paper contains approximation guarantees for neural networks that are
trained with gradient flow, with error measured in the continuous
L2-norm on the d-dimensional unit sphere and targets
that are Sobolev smooth. The networks are fully connected, of constant depth and
increasing width. Although all layers are trained, the gradient flow
convergence is based on a neural tangent kernel (NTK) argument for the
non-convex second-to-last layer. Unlike standard NTK analyses, the continuous
error norm implies an under-parametrized regime, made possible by the natural
smoothness assumption required for approximation. The typical
over-parametrization re-enters the results in the form of a loss in approximation
rate relative to established approximation methods for Sobolev smooth
functions.
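For orientation, the kernel behind the NTK argument can be recalled as follows (standard notation, not taken from the abstract): for a network f(x; θ) with parameters θ, the neural tangent kernel is

```latex
K_\theta(x, x') \;=\; \big\langle \nabla_\theta f(x;\theta),\; \nabla_\theta f(x';\theta) \big\rangle ,
```

and NTK-style analyses show that for sufficiently wide layers this kernel stays nearly constant along the gradient flow, so the training dynamics behave approximately like linear kernel regression.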
Double Greedy Algorithms: Reduced Basis Methods for Transport Dominated Problems
The central objective of this paper is to develop reduced basis methods for
parameter-dependent transport-dominated problems that are rigorously proven to
exhibit rate-optimal performance when compared with the Kolmogorov n-widths
of the solution sets. The central ingredient is the construction of
computationally feasible "tight" surrogates, which in turn are based on deriving
a suitable well-conditioned variational formulation for the parameter-dependent
problem. The theoretical results are illustrated by numerical experiments for
convection-diffusion and pure transport equations. In particular, the latter
example sheds some light on the smoothness of the dependence of the solutions
on the parameters.
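The greedy selection loop at the core of such reduced basis methods can be sketched in a few lines. The sketch below is illustrative only: it uses the true projection error as a stand-in for the paper's computationally feasible tight surrogates, and the parametric "solution manifold" is a made-up family of travelling profiles, not a PDE solution set.

```python
import numpy as np

def greedy_reduced_basis(snapshots, n_basis, tol=1e-10):
    """Weak greedy selection: at each step add the snapshot that is worst
    approximated by the current reduced space (true projection error used
    here in place of a computable error surrogate)."""
    basis = np.zeros((snapshots.shape[0], 0))
    picked = []
    for _ in range(n_basis):
        # projection error of every snapshot onto span(basis)
        proj = basis @ (basis.T @ snapshots)
        errs = np.linalg.norm(snapshots - proj, axis=0)
        k = int(np.argmax(errs))
        if errs[k] < tol:
            break
        picked.append(k)
        # Gram-Schmidt step: orthonormalize the newly selected snapshot
        r = snapshots[:, k] - proj[:, k]
        basis = np.hstack([basis, (r / np.linalg.norm(r))[:, None]])
    return basis, picked

# illustrative parametric family: translated bumps u(x; mu), a toy model
# for the slowly decaying n-widths of transport-dominated solution sets
x = np.linspace(0.0, 1.0, 200)
mus = np.linspace(0.0, 0.5, 30)
S = np.stack([np.exp(-100.0 * (x - 0.25 - mu) ** 2) for mu in mus], axis=1)

B, picked = greedy_reduced_basis(S, n_basis=10)
worst = np.linalg.norm(S - B @ (B.T @ S), axis=0).max()
```

For translated bumps the projection error decays slowly in the basis size, which is exactly the n-width bottleneck for transport problems that the paper's rate-optimality statements are measured against.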
Performance bounds for Reduced Order Models with Application to Parametric Transport
The Kolmogorov n-width is an established benchmark to judge the performance
of reduced basis and similar methods that produce linear reduced spaces.
Although immensely successful in the elliptic regime, this width shows
unsatisfactorily slow convergence rates for transport-dominated problems. While
this has triggered a large amount of work on nonlinear model reduction
techniques, we lack a benchmark to evaluate their optimal performance.
Nonlinear benchmarks like manifold/stable/Lipschitz widths applied to the
solution manifold are often trivial if the degrees of freedom exceed the
parameter dimension, and they ignore desirable structure such as offline/online
decompositions. In this paper, we show that the same benchmarks applied to the
full reduced order model pipeline, from PDE to parametric quantity of interest,
provide non-trivial benchmarks, and we prove lower bounds for transport
equations.
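For reference, the benchmark quantity in question is the following (standard notation, not taken from the abstract): for a compact set K of solutions in a normed space X, the Kolmogorov n-width is

```latex
d_n(\mathcal{K})_X \;=\; \inf_{\substack{V_n \subset X \\ \dim V_n = n}} \;\sup_{u \in \mathcal{K}} \;\inf_{v \in V_n} \,\| u - v \|_X ,
```

i.e. the worst-case error of the best possible n-dimensional linear approximation space. The nonlinear widths mentioned above replace the linear space V_n by nonlinear (manifold, stable, or Lipschitz) approximation families.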
Approximation results for Gradient Descent trained Shallow Neural Networks in 1d
Two aspects of neural networks that have been extensively studied in the
recent literature are their function approximation properties and their
training by gradient descent methods. The approximation problem seeks accurate
approximations with a minimal number of weights. In most of the current
literature these weights are fully or partially hand-crafted, showing the
capabilities of neural networks but not necessarily their practical
performance. In contrast, optimization theory for neural networks heavily
relies on an abundance of weights in over-parametrized regimes.
This paper balances these two demands and provides an approximation result
for shallow networks in 1d with non-convex weight optimization by gradient
descent. We consider finite-width networks and infinite-sample limits, which is
the typical setup in approximation theory. Technically, this problem is not
over-parametrized; however, some form of redundancy reappears as a loss in
approximation rate compared to the best possible rates.
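The setup described above, a finite-width shallow network trained by plain gradient descent on dense samples of a smooth 1d target, can be sketched as follows. This is an illustrative toy, not the paper's construction: the target, width, learning rate, and initialization scale are all assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# smooth target on [0, 1], standing in for a Sobolev-smooth function
def target(x):
    return np.sin(2.0 * np.pi * x)

n, m = 256, 64                    # dense sample grid (proxy for the
x = np.linspace(0.0, 1.0, n)      # infinite-sample limit) and network width
y = target(x)

# shallow ReLU network  x -> sum_j a_j * relu(w_j * x + b_j); all layers trained
w = rng.normal(size=m)
b = rng.normal(size=m)
a = rng.normal(size=m) / np.sqrt(m)

def mse():
    out = np.maximum(np.outer(x, w) + b, 0.0) @ a
    return 0.5 * np.mean((out - y) ** 2)

mse_init = mse()
lr = 0.01
for _ in range(3000):
    pre = np.outer(x, w) + b          # (n, m) pre-activations
    act = np.maximum(pre, 0.0)
    res = act @ a - y                 # pointwise residual
    mask = (pre > 0.0)
    # plain gradient descent on the mean-squared loss, all parameters at once
    grad_a = act.T @ res / n
    grad_w = a * ((mask * res[:, None]) * x[:, None]).sum(axis=0) / n
    grad_b = a * (mask * res[:, None]).sum(axis=0) / n
    a -= lr * grad_a
    w -= lr * grad_w
    b -= lr * grad_b

mse_final = mse()
```

Note that the width m is fixed and moderate, so the run is not in an over-parametrized regime; the loss decreases steadily, but how the achievable error scales with m is exactly the kind of question the paper's approximation result addresses.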
Adaptive anisotropic Petrov-Galerkin methods for first order transport equations
This paper builds on recent developments of adaptive methods for linear transport equations based on certain stable variational formulations of Petrov–Galerkin type. The key issues can be summarized as follows. The variational formulations allow us to employ meshes with cells of arbitrary aspect ratios. We develop a refinement scheme generating highly anisotropic partitions that is inspired by shearlet systems. We establish L2 approximation rates for N-term approximations from corresponding piecewise polynomials for certain compact cartoon classes of functions. In contrast to earlier results in a curvelet or shearlet context, the cartoon classes are concisely defined through certain characteristic parameters, and the dependence of the approximation rates on these parameters is made explicit here. The approximation rate results then serve as a benchmark for subsequent applications to adaptive Galerkin solvers for transport equations. We outline a new class of directionally adaptive, Petrov–Galerkin discretizations for such equations. In numerical experiments, the new algorithms track C2 discontinuity curves stably and accurately, and realize essentially optimal rates. Finally, we treat parameter-dependent transport problems, which arise in kinetic models as well as in radiative transfer. In heterogeneous media these problems feature propagation of singularities along curved characteristics, precluding, in particular, fast marching methods based on ray-tracing. Since the solutions are now functions of spatial variables and parameters, one has to address the curse of dimensionality.
As for the non-adaptive schemes considered in Grella and Schwab (2011) and Grella (2014), we show computationally, for a model parametric transport problem in heterogeneous media in 2+1 dimensions, that sparse tensorization of the presently proposed directionally adaptive variational discretization scheme in physical space, combined with hierarchic collocation in ordinate space, can overcome the curse of dimensionality when approximating averaged bulk quantities.
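The N-term approximation rates established for the cartoon classes refer to the standard best N-term error (notation ours, not the abstract's): for a dictionary of piecewise polynomials ψ_λ on the anisotropic cells,

```latex
\sigma_N(f)_{L_2} \;=\; \inf_{\#\Lambda \le N}\; \Big\| f \;-\; \sum_{\lambda \in \Lambda} c_\lambda \,\psi_\lambda \Big\|_{L_2} ,
```

and the rate results quantify how fast σ_N(f) decays in N, uniformly over a cartoon class, with explicit dependence on the class's characteristic parameters.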